152 research outputs found

    Modelling person-specific and multi-scale facial dynamics for automatic personality and depression analysis

    ‘To know oneself is true progress’. While one's identity is difficult to describe fully, a key part of it is one's personality. Accurately understanding personality can benefit many aspects of human life. There is convergent evidence suggesting that personality traits are marked by non-verbal facial expressions of emotion, which in theory means that automatic personality assessment is possible from facial behaviour. This thesis therefore aims to develop video-based automatic personality analysis approaches. Specifically, two video-level dynamic facial behaviour representations are proposed for automatic personality trait estimation, namely a person-specific representation and a spectral representation, which address three issues that frequently occur in existing automatic personality analysis approaches: 1. the use of very short video segments, or even a single frame, to infer personality traits; 2. the lack of a proper way to retain multi-scale long-term temporal information; 3. the lack of methods to encode person-specific facial dynamics that are relatively stable over time but differ across individuals. This thesis starts by extending the dynamic image algorithm to model the preceding and succeeding short-term facial dynamics of each frame in a video; this representation achieved good performance in estimating valence/arousal intensities, demonstrating its ability to encode dynamics. The thesis then proposes a novel Rank Loss, aiming to train a network that produces a similar per-frame dynamic representation from only a still image. This way, the network can learn generic facial dynamics from unlabelled face videos in a self-supervised manner. Building on this, a person-specific representation encoding approach is proposed: it first freezes the well-trained generic network and incorporates a set of intermediate filters, which are then re-trained, using the same self-supervised learning approach, on videos of a single person. As a result, the learned filter weights are person-specific and can be concatenated into a 1-D video-level person-specific representation. Meanwhile, the thesis also proposes a spectral analysis approach to retain multi-scale video-level facial dynamics. This approach uses automatically detected human behaviour primitives as a low-dimensional per-frame descriptor, and converts the long, variable-length behaviour time series into compact, length-independent spectral representations that capture the video-level multi-scale temporal dynamics of expressive behaviour. Consequently, the combination of the two representations, which captures both multi-scale and person-specific video-level facial dynamics, can be applied to automatic personality estimation.
This thesis conducts a series of experiments to validate the proposed approaches: 1. arousal/valence intensity estimation is conducted on both a controlled face video dataset (SEMAINE) and an in-the-wild face video dataset (Affwild-2) to evaluate the dynamic encoding capability of the proposed Rank Loss; 2. the proposed automatic personality trait recognition systems (spectral representation and person-specific representation) are evaluated on face video datasets labelled with either 'Big-Five' apparent personality traits (ChaLearn) or self-reported personality traits (VHQ); 3. the depression studies are evaluated on the VHQ dataset, which is labelled with PHQ-9 depression scores.
The experimental results on the automatic personality trait and depression severity estimation tasks show that the person-specific representation performs well on the personality tasks, while the spectral vector performs best on the depression task. In particular, the proposed person-specific approach achieved performance similar to the state-of-the-art method on the apparent personality trait recognition task, and at least a 15% PCC improvement over other approaches on the self-reported personality trait recognition task. Meanwhile, the proposed spectral representation outperforms the person-specific approach on depression severity estimation. In addition, the thesis found that adding personality trait labels/predictions to the behaviour descriptors improved depression severity estimation results.
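As a rough illustration of the spectral representation described above, the sketch below converts a variable-length multi-channel behaviour-primitive time series into a fixed-size, length-independent descriptor by keeping a fixed number of low-frequency FFT coefficients per channel. All names, dimensions and the choice of magnitude-only coefficients are illustrative assumptions, not the thesis' actual pipeline.

```python
import numpy as np

def spectral_representation(primitives: np.ndarray, n_coeffs: int = 32) -> np.ndarray:
    """Convert a (T, C) multi-channel behaviour-primitive time series of
    arbitrary length T into a fixed-size spectral descriptor.

    Illustrative sketch: keep the magnitude of the first `n_coeffs` FFT
    coefficients per channel, so videos of different lengths map to
    vectors of identical size.
    """
    T, C = primitives.shape
    feats = []
    for c in range(C):
        spectrum = np.fft.rfft(primitives[:, c] - primitives[:, c].mean())
        mag = np.abs(spectrum)[:n_coeffs]
        # Zero-pad short clips so the descriptor length never depends on T.
        if mag.shape[0] < n_coeffs:
            mag = np.pad(mag, (0, n_coeffs - mag.shape[0]))
        feats.append(mag)
    return np.concatenate(feats)  # shape: (C * n_coeffs,)

# Example: 1500 frames, 20 behaviour primitives (e.g. AUs, gaze, head pose).
video = np.random.rand(1500, 20)
descriptor = spectral_representation(video)
print(descriptor.shape)  # (640,)
```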

    Identification of Two Thermotolerance-Related Genes in Agaricus bisporus

    To characterize thermotolerance-related genes in Agaricus bisporus strain 02, we employed differential display PCR (DD-PCR) to analyze total RNA samples extracted from mycelia grown at different temperatures. Two partial DNA fragments (023-11A and 023-11B) have been cloned so far, whose expression correlated with the culturing temperature. The sequences of the two fragments were determined; the nucleotide sequence of 023-11A matched no known sequence, while 023-11B was similar (identities 24 %, positives 45 %) to a gene coding for the karyopherin docking complex of the nuclear pore complex of Saccharomyces cerevisiae. The two fragments can be used for further characterization of the full-length coding sequences, which could potentially be used to generate new thermotolerant mushroom strains by transgenic techniques.

    Human behaviour-based automatic depression analysis using hand-crafted statistics and deep learned spectral features

    Depression is a serious mental disorder that affects millions of people all over the world. Traditional clinical diagnosis methods are subjective, complicated and require extensive involvement of experts. Audio-visual automatic depression analysis systems predominantly base their predictions on very brief sequential segments, sometimes as little as one frame. Such data contains much redundant information, causes a high computational load, and negatively affects detection accuracy. Final decision making at the sequence level is then based on the fusion of frame- or segment-level predictions. However, this approach loses longer-term behavioural correlations, as the behaviours themselves are abstracted away by the frame-level predictions. We instead propose to use automatically detected human behaviour primitives, such as gaze directions and facial action units (AUs), as low-dimensional multi-channel time-series data, from which two sequence descriptors are created. The first computes sequence-level statistics of the behaviour primitives; the second feeds a spectral representation of the multi-channel behaviour signals to a Convolutional Neural Network. The results of depression detection (binary classification) and severity estimation (regression) experiments conducted on the AVEC 2016 DAIC-WOZ database show that both methods achieve a significant improvement over the previous state of the art in depression severity estimation.
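The first descriptor mentioned above is based on hand-crafted sequence-level statistics; a minimal sketch of that idea is given below, assuming the per-frame behaviour primitives (AU intensities, gaze directions, etc.) are already extracted. The particular statistics chosen here are illustrative rather than the paper's exact feature set.

```python
import numpy as np

def statistics_descriptor(primitives: np.ndarray) -> np.ndarray:
    """Hand-crafted sequence-level statistics over a (T, C) matrix of
    per-frame behaviour primitives (AUs, gaze directions, head pose, ...).

    Illustrative choice of statistics: mean, standard deviation, min, max
    and mean absolute frame-to-frame change per channel.
    """
    stats = [
        primitives.mean(axis=0),
        primitives.std(axis=0),
        primitives.min(axis=0),
        primitives.max(axis=0),
        np.abs(np.diff(primitives, axis=0)).mean(axis=0),
    ]
    return np.concatenate(stats)  # shape: (5 * C,)

clip = np.random.rand(900, 17)    # e.g. 30 s at 30 fps, 17 AU intensities
x = statistics_descriptor(clip)   # fixed-size input for a classifier/regressor
print(x.shape)                    # (85,)
```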

    Spectral Representation of Behaviour Primitives for Depression Analysis

    Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification

    Most deep learning-based acoustic scene classification (ASC) approaches identify scenes from acoustic features extracted from audio clips, in which information from polyphonic audio events (AEs) is entangled. However, these approaches have difficulty explaining which cues they use to identify scenes. This paper conducts the first study on disclosing the relationship between real-life acoustic scenes and semantic embeddings of the most relevant AEs. Specifically, we propose an event-relational graph representation learning (ERGL) framework for ASC that classifies scenes and, at the same time, clearly indicates which cues are used for classification. In the event-relational graph, the embedding of each event is treated as a node, while relationship cues derived from each pair of nodes are described by multi-dimensional edge features. Experiments on a real-life ASC dataset show that the proposed ERGL achieves competitive performance on ASC by learning embeddings of only a limited number of AEs. The results show the feasibility of recognizing diverse acoustic scenes based on the audio event-relational graph. Visualizations of graph representations learned by ERGL are available at https://github.com/Yuanbo2020/ERGL.
Comment: IEEE Signal Processing Letters, doi: 10.1109/LSP.2023.331923
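A minimal sketch of the event-relational-graph idea described in the abstract, written in plain PyTorch rather than the authors' ERGL code: each audio event's embedding becomes a node, every node pair yields a multi-dimensional edge feature, and one round of edge-conditioned message passing is followed by a mean readout and a scene classifier. Layer sizes, the use of event posteriors as node inputs, and the aggregation scheme are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class TinyEventRelationalGraph(nn.Module):
    """Illustrative sketch (not the authors' ERGL implementation)."""

    def __init__(self, n_events: int, emb_dim: int = 64, edge_dim: int = 16,
                 n_scenes: int = 10):
        super().__init__()
        self.node_emb = nn.Linear(1, emb_dim)           # event activity -> node embedding
        self.edge_mlp = nn.Sequential(                  # pairwise edge features
            nn.Linear(2 * emb_dim, edge_dim), nn.ReLU())
        self.msg_mlp = nn.Sequential(                   # edge-conditioned messages
            nn.Linear(emb_dim + edge_dim, emb_dim), nn.ReLU())
        self.classifier = nn.Linear(emb_dim, n_scenes)

    def forward(self, event_probs: torch.Tensor) -> torch.Tensor:
        # event_probs: (batch, n_events) posterior of each audio event in the clip
        nodes = self.node_emb(event_probs.unsqueeze(-1))       # (B, N, D)
        B, N, D = nodes.shape
        src = nodes.unsqueeze(2).expand(B, N, N, D)            # sender node i
        dst = nodes.unsqueeze(1).expand(B, N, N, D)            # receiver node j
        edges = self.edge_mlp(torch.cat([src, dst], dim=-1))   # (B, N, N, E)
        msgs = self.msg_mlp(torch.cat([src, edges], dim=-1))   # (B, N, N, D)
        nodes = nodes + msgs.mean(dim=1)                       # aggregate senders per receiver
        return self.classifier(nodes.mean(dim=1))              # graph readout -> scene logits

model = TinyEventRelationalGraph(n_events=25)
logits = model(torch.rand(4, 25))   # 4 clips, 25 event posteriors each
print(logits.shape)                 # torch.Size([4, 10])
```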

    Multiple Appropriate Facial Reaction Generation in Dyadic Interaction Settings: What, Why and How?

    According to the Stimulus Organism Response (SOR) theory, all human behavioural reactions are stimulated by context: people process the received stimulus and produce an appropriate reaction. This implies that, for a given input stimulus in a specific context, a person can react differently according to their internal state and other contextual factors. Analogously, in dyadic interactions humans communicate using verbal and non-verbal cues, and a broad spectrum of listeners' non-verbal reactions might be appropriate responses to a specific speaker behaviour. A body of work has already investigated the problem of automatically generating an appropriate reaction for a given input; however, none has attempted to automatically generate multiple appropriate reactions in the context of dyadic interactions and to evaluate the appropriateness of those reactions using objective measures. This paper defines the facial Multiple Appropriate Reaction Generation (fMARG) task for the first time in the literature and proposes a new set of objective evaluation metrics to evaluate the appropriateness of the generated reactions. The paper subsequently introduces a framework to predict, generate, and evaluate multiple appropriate facial reactions.
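The paper's actual fMARG metrics are not reproduced here; the sketch below only illustrates the general flavour of objective appropriateness evaluation, under the assumption that reaction sequences are represented as fixed-length feature matrices: one hypothetical score measures how close a generated reaction is to its nearest known-appropriate reaction, and another measures the diversity of the multiple generated reactions.

```python
import numpy as np

def appropriateness_score(generated: np.ndarray, appropriate: np.ndarray) -> float:
    """Illustrative metric (not the paper's actual fMARG measure): a generated
    facial-reaction feature sequence is scored by its distance to the nearest
    reaction in the set known to be appropriate for that speaker behaviour.

    generated:   (T, D) feature sequence for one generated reaction
    appropriate: (K, T, D) K real listener reactions judged appropriate
    """
    dists = np.linalg.norm(appropriate - generated[None], axis=(1, 2))
    return float(dists.min())   # lower = closer to some appropriate reaction

def diversity_score(samples: np.ndarray) -> float:
    """Mean pairwise distance between M generated reactions (M, T, D);
    higher means the model covers more of the space of plausible reactions."""
    M = samples.shape[0]
    d = [np.linalg.norm(samples[i] - samples[j])
         for i in range(M) for j in range(i + 1, M)]
    return float(np.mean(d)) if d else 0.0

gen = np.random.rand(5, 150, 25)    # 5 sampled reactions, 150 frames, 25 AU features
real = np.random.rand(8, 150, 25)   # 8 appropriate ground-truth reactions
print(appropriateness_score(gen[0], real), diversity_score(gen))
```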